Abstract

Multivariate methods have recently been introduced and successfully applied to discriminate signal from
background in the selection of genuine very-high-energy gamma-ray events with the H.E.S.S. Imaging Atmospheric
Cerenkov Telescope. The complementary performance of three independent reconstruction methods developed for the
H.E.S.S. data analysis, namely Hillas, model and 3D-model, suggests optimizing their combination through an
efficient multivariate estimator. In this work the boosted decision tree method is proposed,
leading to a significant increase in the signal over background ratio compared to the standard approaches. The improved
sensitivity is also demonstrated through a comparative analysis of a set of benchmark astrophysical sources.

Key words: Multivariate, Decision tree, H.E.S.S., γ-ray, Cerenkov, IACT

1. Introduction

In the past decade, a new astronomical window has been opened thanks to the latest generation of
ground-based Imaging Atmospheric Cerenkov Telescopes (IACTs). Before the construction of IACT arrays
such as H.E.S.S., M.A.G.I.C. and V.E.R.I.T.A.S., only a few very-high energy (VHE) γ-ray sources
(>100 GeV) were known. This new generation of experiments has resulted in the discovery of many tens
of galactic and extra-galactic γ-ray sources.

The H.E.S.S. system is currently the most efficient instrument for observing the inner part of the
Galactic plane. The system is composed of four IACTs and reaches a sensitivity of 1% of the Crab
Nebula flux in around 25 h of observations [1]. A systematic survey of about a third of the Galactic
plane has been conducted since the beginning of observations in full operation mode in 2004, leading
to the discovery of more than 50 sources within our Galaxy [?].

IACTs detect the Cerenkov light emitted by the secondary particle showers generated by the
interaction of the incoming γ ray with the atmosphere. In the search for a γ-ray signal, they face a
dominant background due to hadron-induced showers. Three alternative reconstruction and
discrimination methods have been developed and applied so far to the H.E.S.S. data analysis, namely
Hillas, model and 3D-model. They have been individually improved and updated in recent years. The
Xeff multivariate analysis method has recently been introduced [4] in the H.E.S.S. data analysis. By
combining the three methods, it increases the power to discriminate genuine VHE γ-ray events from
the cosmic-ray background and improves the reconstruction performance (e.g. energy and direction
reconstruction). In this work the optimization of the multivariate analysis is presented through the
application of a boosted decision tree (BDT) method leading to a single alternative discriminating
estimator. After describing the methodology, some applications of the proposed multivariate method
are presented in order to demonstrate the achieved gain in sensitivity and precision.



2. Current methods used in H.E.S.S. data analysis 

The three shower reconstruction methods applied so far
in the H.E.S.S. data analysis are briefly described in this
section.



2.1. Hillas analysis

This historical method was introduced by M. Hillas in 1985 for single-telescope analysis [?] and was
the first method applied to the H.E.S.S. data analysis for multi-telescope images (hereafter Hillas
method). The so-called Hillas parameters of the shower are extracted by fitting an ellipse to the
images. The dimensions (length and width) and orientation of the image on the focal plane (azimuthal
angle, distance of the image barycenter to the camera center ...) are estimated from the fit. A
charge measurement (in number of photo-electrons) is obtained from the total amplitude of the
images. The direction of the incoming particle is estimated through the ellipse orientation, while
the shower energy is estimated with the total image amplitude and the reconstructed impact parameter
of the shower. The discrimination between hadron and γ-ray events is done through scaled variables.
The Hillas geometric parameters of the image, the length and width, are scaled with the mean values
and the dispersion obtained from Monte Carlo simulations. The scaled variables are then averaged
over the various triggered telescopes. They are often combined in a unique discriminating variable,
the mean scaled sum, defined as (MeanScaledWidth + MeanScaledLength)/√2. For more information see
[?].
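The scaling described above can be illustrated with a minimal Python sketch. It assumes that the expected mean and dispersion of each parameter have already been read from Monte Carlo look-up tables, that the scaled value is the difference to the expected mean divided by the dispersion, and that the mean scaled sum is normalised by √2. The function names are hypothetical and are not taken from the H.E.S.S. software.

```python
import math

def scaled(value, expected_mean, expected_sigma):
    """Scale one Hillas parameter with its Monte Carlo expectation (assumed form)."""
    return (value - expected_mean) / expected_sigma

def mean_scaled(values, expected_means, expected_sigmas):
    """Average the scaled parameter over the triggered telescopes."""
    terms = [scaled(v, m, s)
             for v, m, s in zip(values, expected_means, expected_sigmas)]
    return sum(terms) / len(terms)

def mean_scaled_sum(mean_scaled_width, mean_scaled_length):
    """Combine both mean scaled variables (sqrt(2) normalisation assumed)."""
    return (mean_scaled_width + mean_scaled_length) / math.sqrt(2.0)
```

With this convention a γ-ray-like event yields values close to zero, while broader hadronic images are shifted towards larger values.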

2.2. Semi-analytical model analysis

This method was first developed by the CAT collaboration [7] and has been applied to H.E.S.S. data
analysis [8] (hereafter model method). It is based on the comparison of the shower image with a
shower prediction given by a semi-analytical model. The image is compared to the images stored in a
model look-up table and a log-likelihood minimization of the fit is done over all the available
pixels. The parameters of the most probable image give the primary particle energy and incoming
direction. The discrimination between γ rays and hadrons is achieved with a goodness-of-fit variable
combined with the shower primary depth, which is a free parameter of the model. As for the Hillas
analysis, this variable is often rescaled with the simulation mean value and dispersion. Recent
improvements of this analysis method have greatly increased the sensitivity. For more information
see [?].

2.3. 3D model analysis

The 3D-model reconstruction is the third method developed for the analysis of H.E.S.S. data [?]
(hereafter model3D method). It consists of modeling the atmospheric shower as a Gaussian photosphere
with anisotropic angular light distribution. This model is then used to predict the light collected
in each pixel of the camera. Several shower parameters are extracted from the fit of the recorded
image with the model prediction. The rotational symmetry of the shower with respect to the main axis
can then be used to discriminate hadrons and γ rays, through the reduced 3D-width variable. For more
information see [?].



3. The Boosted Decision Tree method 



The discrimination methods previously described are all based on simple cut-based analysis
techniques. Extensions of such techniques, such as neural networks or decision trees, already used
in a wide range of domains, have been introduced and applied in the field of high energy physics.
Their main advantage is that they take into account non-linear correlations between input
parameters. Decision trees have the additional property of being insensitive to input parameters
without discrimination power.

A decision tree is a decision support tool that uses a tree-like model of decisions to separate two
populations in terms of signal or background [11]. Starting from the initial event sample, a search
for the best criterion among the discriminant variables is performed. The selection results in two
event samples that are submitted to the same procedure. Repeated binary selections are then
performed on the subsequent event samples until some stop criterion is reached. When the splitting
is stopped, the events in the terminal nodes (which are called leaves) are classified as signal-like
or background-like according to the class the majority of events belongs to. The stop criterion is
set to avoid a discrimination that is tuned too closely to the training sample. The splitting could
otherwise continue until the leaves contain only signal or only background events, which would imply
that the trees are overtrained. To avoid this problem, the tree is pruned to remove the
statistically insignificant nodes.

The boosting process aims to stabilize the response of a tree and improve its performance. The BDT
method consists of a forest of successive trees. Events misclassified by the previous tree are given
a higher event weight in the following tree. In the most popular boosting technique, AdaBoost [12],
the following tree is trained with a modified initial event sample where the weight of misclassified
events is multiplied by a boost weight α. This weight is derived from the fraction err of
misclassified events in the previous tree as α = (1 - err)/err. Once the forest has been defined,
the signal or background likelihood of an individual event is estimated by applying the set of
splittings of the various trees, and is averaged over the forest according to weights set to
stabilize the decision procedure.
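The re-weighting step can be sketched in a few lines of Python. This is a minimal illustration under two assumptions: the boost weight takes the usual AdaBoost form α = (1 - err)/err, where err is the weighted fraction of misclassified events, and each tree votes +1 (signal) or -1 (background) with a weight ln(α) in the forest average. Function names are hypothetical.

```python
import math

def adaboost_step(weights, misclassified):
    """Return the boost weight alpha and the re-weighted event sample.

    weights:       current event weights
    misclassified: booleans, True where the previous tree was wrong
    """
    total = sum(weights)
    err = sum(w for w, bad in zip(weights, misclassified) if bad) / total
    if not 0.0 < err < 1.0:
        raise ValueError("err must lie strictly between 0 and 1")
    alpha = (1.0 - err) / err
    boosted = [w * alpha if bad else w for w, bad in zip(weights, misclassified)]
    # Renormalise so that the sum of weights is unchanged.
    norm = total / sum(boosted)
    return alpha, [w * norm for w in boosted]

def forest_response(votes, alphas):
    """Weighted average of the tree votes, using ln(alpha) as tree weight (assumed)."""
    tree_weights = [math.log(a) for a in alphas]
    return sum(w * v for w, v in zip(tree_weights, votes)) / sum(tree_weights)
```

After one step the misclassified events carry half of the total weight, which forces the next tree to concentrate on them.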



4. Application to H.E.S.S. data 

Boosted decision trees have already been applied to the analysis of H.E.S.S. data to discriminate
between showers generated by leptons and hadrons. This led to a ground-based measurement of the
electron + positron spectrum with H.E.S.S. [13]. Another analysis has been developed with a
combination of variables derived from the Hillas-moment method. It showed a clear improvement of the
hadron/γ-ray discrimination compared to the standard Hillas analysis [14].

The aim of this work is to apply the BDT technique to the various methods of Cerenkov shower image
analysis currently used by the H.E.S.S. Collaboration, as described in section 2. The combination of
these independent and complementary methods is expected to significantly improve the γ-ray/hadron
discrimination. In this section, the procedure followed in this work is described.



4.1. Training samples

The γ-ray event sample used to train the BDTs has been taken from Monte Carlo simulations. The γ
rays have been simulated with the shower simulation code KASKADE [?], with an impact parameter up to
550 meters from the array center. The zenith angle varies from 0° to 70°. The off-axis angle of the
showers has been taken from 0° to 2.5° from the camera axis in steps of 0.5°, which corresponds to
the actual field of view of the H.E.S.S. camera.

The background event sample has been selected from real H.E.S.S. events. The events have been chosen
from extra-galactic observations in order to avoid contamination by a potential diffuse γ-ray
background or undetected galactic γ-ray sources. The events coming from known extra-galactic VHE
γ-ray sources have been excluded and the remaining events detected by H.E.S.S. in those observations
are considered as background events. Most of these showers are generated by hadronic cosmic rays or
electrons. The extra-galactic diffuse γ-ray background is usually assumed to be negligible at TeV
energies.

Figure 1: Distributions of the variables used in the BDT method for a zenith angle ranging between 25° and 35° and an energy between
500 GeV and 1 TeV. The blue filled and hatched red histograms are the simulated γ-ray event and background event distributions respectively.
The four upper panels are the main variables coming from the original analysis methods: Mean Scaled Width, Mean Scaled Length, Rescaled
Width 3D and Mean Scaled Goodness (from left to right). The lower panels include the additional variables: Primary Depth and the difference
between reconstructed directions: Hillas, model & model3D (left to right).

4.2. Input variables for combination

Four variables are used: the mean scaled width and length of the images from the Hillas method; the
mean scaled goodness from the model analysis; and the rescaled width from the model3D analysis. The
distribution of the variables for simulated γ-ray and hadron events is shown in the upper panels of
figure 1. As expected, the γ-ray distributions are centered on the origin while the hadron
distributions are shifted towards larger values. Moreover, it has been shown that these variables
are almost uncorrelated for γ rays and partially correlated for hadrons [?].

In addition to these variables, a set of 4 variables has been added to improve the discrimination.
The primary interaction depth of the particle, scaled in terms of photon radiation length, has been
shown to have a significantly different distribution for hadrons and γ rays [?]. Figure 1 shows the
distribution of the primary interaction depth for hadrons and γ rays estimated through the model
method. It should be noted that this variable is obtained by comparison to simulated γ-ray showers
and assumes a γ-ray nature of the event. The non-convergence of the fit procedure for the background
is responsible for the overflow cumulative bin observed in figure 1 (at -2). This variable is
already used to operate an event preselection by the model and model3D analyses. Moreover, each
reconstruction method gives a reconstructed direction, and these directions are not necessarily
identical. It has been shown that the fluctuations of these reconstructions are larger for hadrons
than for γ rays [4]. An additional discrimination can thus be achieved using the differences between
the directions reconstructed by each pair of methods (Δθ_Hillas-Model, Δθ_Hillas-Model3D and
Δθ_Model-Model3D). The distribution of the additional variables is shown in the lower panels of
figure 1 for hadrons and γ rays. The discrimination power of these variables is visible in this
figure.

4.3. Training strategy

The shapes of the particle shower and of the Cerenkov image change with the particle energy, and the
distributions of the discriminating variables change as well. The four variables from the standard
methods are scaled using look-up tables generated with MC simulations, as a function of energy and
observing zenith angle. Since these variables are scaled, large variations of their distributions
with the energy and the observation conditions are not expected. In contrast, the second set of
variables, including the primary depth and the differences in reconstructed directions, depends on
the energy of the particle and the observation conditions.

The strategy followed to discriminate between hadrons and γ rays is the same as in [?]. The H.E.S.S.
dynamical range has been divided into six energy bands from 100 GeV to 100 TeV. Both simulated and
real events have been distributed within these bands using the energy derived by the combined method
already applied in the Xeff analysis and described in [4]. This method combines the reconstructed
energies derived from the three original methods and improves the energy resolution compared to each
single analysis. Likewise, the zenith angle range (0°-70°) has been divided into seven bands. The
BDTs have been trained separately within these energy and zenith angle bands. Due to trigger
effects, the statistics within several low-energy bins at high zenith angle are very low and the
corresponding events have been neglected: zenith angles larger than 35° and 52.5° for the first and
second energy bands respectively (empty bins in figures 4 and 6). The training and event selection
have been done using the BDT method provided by the multivariate analysis package TMVA [?]. The
adaptive boosting method AdaBoost has been used.
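The assignment of an event to a training bin can be sketched as follows. The energy band edges are those printed on the energy axis of figure 4. The text only gives the number of zenith bands and a few of their limits (15°, 25°, 35°, 52.5°, 70°), so the two intermediate zenith edges below (42.5° and 47.5°) are assumptions made for illustration, as are all function names.

```python
from bisect import bisect_right

# Six energy bands in TeV (edges read from the axis labels of figure 4).
ENERGY_EDGES_TEV = [0.1, 0.3, 0.5, 1.0, 2.0, 5.0, 100.0]
# Seven zenith bands in degrees; 42.5 and 47.5 are ASSUMED edges.
ZENITH_EDGES_DEG = [0.0, 15.0, 25.0, 35.0, 42.5, 47.5, 52.5, 70.0]

def band_index(value, edges):
    """Index of the band containing value, or None if out of range."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1

def training_bin(energy_tev, zenith_deg):
    """Return (energy band, zenith band), or None for neglected events."""
    e = band_index(energy_tev, ENERGY_EDGES_TEV)
    z = band_index(zenith_deg, ZENITH_EDGES_DEG)
    if e is None or z is None:
        return None
    # Low-statistics bins are dropped: first energy band above 35 deg,
    # second energy band above 52.5 deg.
    if e == 0 and zenith_deg >= 35.0:
        return None
    if e == 1 and zenith_deg >= 52.5:
        return None
    return (e, z)
```

One BDT forest is then trained per returned pair, so an event is always classified by the forest trained on events of similar energy and zenith angle.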

Figure 2: BDT response distribution for the events with a zenith angle ranging between 25° and 35°
and a reconstructed energy between 500 GeV and 1 TeV. The blue filled and hatched red histograms are
the simulated γ-ray and background distributions respectively.

Several parameters can be modified in order to improve the method efficiency and to control its
stability. Particular attention has been paid to the control of the over-training of the BDT, since
a classifying tree tuned too closely to the training sample can lead to bias effects. The BDT
response has been checked with an independent event sample: the consistency of the training and test
sample BDT distributions has been verified for each zenith angle and energy bin. The parameters of
the BDT trainings have been slightly modified with respect to the default values optimised by the
TMVA developers. Their choice is the result of a compromise between hadron/γ-ray discrimination
efficiency and the absence of over-training. The tree forests are composed of 200 trees. The
selection splitting has been done at the node level by scanning 100 steps over the variable
distributions. The separation between the populations is performed using the Gini index criterion,
defined as p × (1 - p), where p = S/(S + B) is the purity of the sample (S and B are the numbers of
signal and background events). Further splitting is stopped when the number of events falls below
20. The tree pruning is performed using the cost complexity method [11] with a pruning strength set
at 20.
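A node split based on the Gini index can be sketched as below. This is a simplified, unweighted illustration and not the TMVA implementation: it assumes that the quantity maximised is the decrease of the Gini index weighted by the number of events in each node, and it scans a fixed number of equidistant cut values over the range of one variable. Function names are hypothetical.

```python
def gini(signal, background):
    """Gini index p * (1 - p) with purity p = S / (S + B)."""
    total = signal + background
    if total == 0:
        return 0.0
    p = signal / total
    return p * (1.0 - p)

def best_cut(sig_values, bkg_values, n_steps=100):
    """Scan n_steps cut values on one variable and return (cut, gain)."""
    lo = min(min(sig_values), min(bkg_values))
    hi = max(max(sig_values), max(bkg_values))
    n_s, n_b = len(sig_values), len(bkg_values)
    parent = (n_s + n_b) * gini(n_s, n_b)
    best = (None, 0.0)
    for i in range(1, n_steps):
        cut = lo + (hi - lo) * i / n_steps
        s_left = sum(v < cut for v in sig_values)
        b_left = sum(v < cut for v in bkg_values)
        left = (s_left + b_left) * gini(s_left, b_left)
        right = (n_s - s_left + n_b - b_left) * gini(n_s - s_left, n_b - b_left)
        gain = parent - left - right
        if gain > best[1]:
            best = (cut, gain)
    return best
```

In a full tree this scan would be repeated for every input variable at every node, and the variable with the largest gain would define the split.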

4.4. BDT response

Figure 2 shows the results of the tests of the trained BDT with an independent test sample of γ-ray
and background events for one zenith angle and energy band. The discrimination power of this new
variable is clearly visible when compared to the distributions of the original variables. The
rejection efficiency is improved with respect to the original variables for all energies and zenith
angles. Figure 3 shows, for three cases, the receiver operating characteristic (ROC) diagram for the
BDT method described in this paper compared to the main variable from the original Hillas method.
For a given level of hadron rejection, the combined estimator keeps a larger fraction of γ-ray
events, which illustrates the improvement in hadron rejection made possible by this combination of
methods.
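A ROC diagram of this kind can be built from the BDT responses of the two test samples, as in the following minimal sketch. It assumes that signal-like events receive larger BDT values and that an event is kept when its response exceeds the threshold. The function name is hypothetical.

```python
def roc_points(sig_scores, bkg_scores, thresholds):
    """Return (gamma-ray efficiency, background rejection) for each threshold."""
    points = []
    for t in thresholds:
        # Fraction of signal events kept by the cut.
        efficiency = sum(s > t for s in sig_scores) / len(sig_scores)
        # Fraction of background events removed by the cut.
        rejection = sum(b <= t for b in bkg_scores) / len(bkg_scores)
        points.append((efficiency, rejection))
    return points
```

Scanning the threshold over the full range of the BDT response traces the curve. A better estimator gives a higher rejection at the same γ-ray efficiency.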

4.5. Selection cuts determination

Once the BDTs have been trained, the cut values on the estimator have to be chosen. Three
optimization strategies have been applied depending on the source strength. The first set is
dedicated to the analysis of strong sources such as the Crab Nebula. The second is defined for
intermediate source fluxes of the order of 10% of the Crab Nebula flux. The last one is optimized
for faint source searches with fluxes of the order of 1% of the Crab Nebula flux. The cut values
have been chosen for these three sets within each zenith angle and energy band. The Crab Nebula has
been used as a reference. The signal over background ratio has been estimated without any event
selection within a region of 0.11° around the position of the Nebula (standard angular cut for
point-like sources). For every value of the BDT estimator, the corresponding γ-ray and background
event efficiencies have been applied to 100%, 10% and 1% of the measured Crab Nebula signal over
background ratio, for each set of cuts respectively. In each band, the BDT output value for which
the significance of the signal S/√(S + B) is maximum has been chosen. In this formula, S and B are
the numbers of events selected from the signal and background samples.

Figure 3: Background rejection efficiency as a function of the γ-ray efficiency (receiver operating
characteristic diagram). The BDT method presented in this paper is shown with the continuous line,
while the mean scaled sum variable from the Hillas analysis is shown with the dotted line. The
diagrams are for events with an energy and a zenith angle corresponding to: a) 100 GeV-300 GeV and
0°-15° b) 500 GeV-1 TeV and 25°-35° c) 500 GeV-1 TeV and 52.5°-70°.

Figure 4: Top: Gamma-ray efficiency distribution over the zenith angle and energy bins for the faint
source set of cuts. Bottom: Corresponding background efficiency distribution. (Energy axis of both
panels, in TeV: 0.1-0.3, 0.3-0.5, 0.5-1, 1-2, 2-5, 5-100.)

The distributions of the γ-ray efficiency and background efficiency for the chosen BDT value are
shown in figure 4 for the faint source set of cuts. The average values of the γ-ray and background
efficiencies are 60% and 2% respectively. These efficiencies show a significant dependence on the
energy and zenith angle. The optimization of the analysis implies a slightly lower γ-ray efficiency
at high zenith angles, while the background efficiency increases at low energies.
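The cut optimisation within one band can be sketched as follows. The sketch assumes that the γ-ray and background efficiencies have already been tabulated for a list of candidate BDT cut values, and that the expected signal is the Crab Nebula signal scaled by the flux fraction of the cut set (1.0, 0.1 or 0.01). All names and the numbers in the usage example are illustrative and are not taken from the analysis.

```python
import math

def best_bdt_cut(cuts, gamma_eff, bkg_eff, crab_signal, background, flux_fraction):
    """Return the cut maximising S / sqrt(S + B) and the corresponding value.

    cuts, gamma_eff, bkg_eff: parallel lists for the candidate cut values
    crab_signal, background:  event counts before any BDT selection
    flux_fraction:            source strength in units of the Crab Nebula flux
    """
    best_cut, best_significance = None, -1.0
    for cut, eff_g, eff_b in zip(cuts, gamma_eff, bkg_eff):
        s = flux_fraction * crab_signal * eff_g
        b = background * eff_b
        if s + b <= 0.0:
            continue
        significance = s / math.sqrt(s + b)
        if significance > best_significance:
            best_cut, best_significance = cut, significance
    return best_cut, best_significance

# Illustrative numbers only: a strong source favours a loose cut,
# a faint source favours a tighter one.
cuts = [0.0, 0.2, 0.4]
gamma_eff = [0.9, 0.7, 0.4]
bkg_eff = [0.10, 0.02, 0.01]
strong_cut, _ = best_bdt_cut(cuts, gamma_eff, bkg_eff, 1000.0, 1000.0, 1.0)
faint_cut, _ = best_bdt_cut(cuts, gamma_eff, bkg_eff, 1000.0, 1000.0, 0.01)
```

The example shows why three sets of cuts are useful: when the signal dominates, keeping more γ rays matters most, whereas for faint sources the background term drives the optimum towards harder cuts.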

5. Systematic studies 

5.1. Comparison between Monte Carlo simulations and data

The consistency between Monte Carlo simulations and real events has been checked. The MC-data
consistency for the variables from the original methods has been tested in the framework of previous
studies of these methods [?]; they showed a good agreement of the simulations with γ-ray candidate
H.E.S.S. events. A good agreement has been observed for the four additional variables (primary depth
and differences between reconstructed directions) in [4].

The consistency of the BDT response between simulated and real γ-ray events has been checked for the
various trainings. Several strong sources have been used to test this consistency: the Crab Nebula,
PKS 2155-304 and HESS J1745-290. These sources have been observed within a large range of zenith
angles, which allows the BDT response to be tested over the seven zenith angle bins. Furthermore,
the deep exposure and the brightness of the observed sources provide enough statistics to check the
MC-data consistency within all the energy bins. The BDT responses obtained with the simulations have
been compared to the On source event distributions after subtraction of the Off source event
distributions. The On-Off distribution corresponds to the γ-ray candidate events. In all the energy
and zenith angle bins, a good agreement between the simulated γ-ray and the On-Off distributions has
been observed. Figure 5 shows the simulation and On-Off data BDT response distributions for one bin.
It shows the reliability and the robustness of this discriminating method.

Figure 5: BDT output distributions for Monte Carlo γ-ray events and real H.E.S.S. data with a
reconstructed energy ranging between 500 GeV and 1 TeV and a zenith angle between 25° and 35°. The
blue histogram is the Monte Carlo test sample applied to the BDT forest. The points are the On-Off
event distribution from the Crab Nebula, PKS 2155-304 and HESS J1745-290.

5.2. Comparison with current analysis and published results

Figure 6: Ratio of the quality factor Qf of the present work to the quality factor of the Hillas
analysis with soft cuts, within the zenith angle and energy bins.

Figure 6 shows the ratio between the quality factor of the BDT analysis and the quality factor of
the Hillas analysis (Qf = ε_γ/√ε_h, where ε_γ and ε_h are the γ-ray and hadron efficiencies
respectively), which illustrates the improvement in discrimination. This ratio ranges from 1.4 to
6.6 and shows that the BDT method greatly improves the rejection at all energies and zenith angles.
The figure also shows that the improvement depends on the energy. As the original analyses are not
energy dependent, they are mainly optimized for the lower energies, where the statistics are the
largest. An energy-binned analysis, such as the present BDT analysis, is thus very useful to
increase the discrimination power at higher energies. The effect is illustrated in the figure: at
lower energy the BDT gives a better quality factor, but the major increase is located at higher
energy, where it reaches in some bins a value around 6.6 times the Hillas quality factor. A fraction
of the improvement at high energy comes from the presence of the model and model3D methods, which
are more efficient at high energy. However, when compared to the model analysis, the quality factor
ratio still reaches values larger than 5 at high energy. Moreover, it should be noted that due to
the falling power-law spectrum of cosmic rays, the background statistics are limited at higher
energy. The major improvement of the analysis is observed above 500 GeV, where the ratio is higher
than 1.5. The method also increases the discrimination power of the analysis at high zenith angle
compared to the standard analysis.
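The quality factor comparison can be written as a short Python sketch, assuming the usual definition Qf = ε_γ/√ε_h. The reference efficiencies in the usage example are invented for illustration and are not values from the Hillas analysis. Only the 60% and 2% average efficiencies quoted in section 4.5 come from the text.

```python
import math

def quality_factor(gamma_eff, hadron_eff):
    """Qf = gamma-ray efficiency / sqrt(hadron efficiency)."""
    return gamma_eff / math.sqrt(hadron_eff)

def quality_ratio(gamma_eff, hadron_eff, ref_gamma_eff, ref_hadron_eff):
    """Ratio of the quality factor of an analysis to that of a reference analysis."""
    return quality_factor(gamma_eff, hadron_eff) / quality_factor(ref_gamma_eff, ref_hadron_eff)

# Average faint-source efficiencies quoted in section 4.5.
qf_bdt = quality_factor(0.60, 0.02)
# Hypothetical reference with the same gamma-ray efficiency and four times more background.
ratio = quality_ratio(0.60, 0.02, 0.60, 0.08)
```

Because the hadron efficiency enters under a square root, reducing the background by a factor of four at fixed γ-ray efficiency doubles the quality factor.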

The performance has been checked with several VHE γ-ray sources detected by H.E.S.S., which
represent a wide range of sources in terms of extension, background conditions (galactic or
extra-galactic) and spectrum. Table 1 shows the results of the BDT analysis for these sources. These
analyses have been made using the published data sets. The residual background estimation has been
performed using the reflected-region technique (for more details see [?]). The better performance of
the BDT analysis compared to the original methods is illustrated by these results. The significance
of the excess is greatly increased for all the sources. The signal over background ratio is, as
expected, clearly increased. This allows a lower level of background contamination for subsequent
spectral studies and thus a better sensitivity.

Figure 7: Effective area for γ-ray collection after application of the strong source set of cuts of
the BDT method. The effective area is computed for an azimuthal angle of 180° and an off-axis angle
of 0.5°. The black circles, blue squares and green triangles are for zenith angles of 10°, 30° and
55° respectively. The dashed line is the effective area for γ-ray collection with the Hillas method
for a zenith angle of 55°.

5.3. Spectral analysis

Figure 7 shows the photon effective area of the H.E.S.S. array as a function of the Monte Carlo
simulated photon energy. The three curves are for zenith angles of 10°, 30° and 55°. A major issue
can come from the optimization of the analysis in energy bands: the band-wise cut optimization can
lead to a band effect within the effective area and can generate systematic fake structures within
the spectrum. No such effect is visible in figure 7, nor in the other zenith angle bins.

The consistency of the analysis and the associated spectral analysis have also been verified on the
VHE γ-ray source list from the previous section. A pure power law has been fitted to the data. Table
2 summarizes the spectral results obtained with the BDT analysis. They are compared to the published
values. On all these reference sources, the BDT method gives results consistent with the published
ones. The BDT analysis allows the energy range of the fit to be extended. Due to the increased
rejection power from the variable combination and the fact that the optimisation of the analysis has
been achieved for several bins in energy, an improvement of the analysis over the full energy range
is observed. While a slight decrease of the energy threshold is observed, the gain is particularly
important at higher energy, where figure 6 shows that the discrimination is greatly improved
compared to the original methods. For instance, the energy range of the fit for the source H2356-309
has been considerably broadened at higher energies. The increase in effective area and
discrimination results in keeping, after selection, three γ-ray events between 3 TeV and 12 TeV that
are considered as background events by the other methods. Additionally, a fit of the γ-ray spectrum
in the published energy range has been performed for all these sources and gives results consistent
both with the full-range BDT spectra and with the published spectra.

Source      Cut set (Crab S/B)   Γ              Φ(>200 GeV)
G0.9+0.1    0.5%                 2.30 ± 0.07    4.5 ± 0.4
            1%                   2.30 ± 0.07    4.5 ± 0.4
            2%                   2.30 ± 0.07    4.4 ± 0.4
            10%                  2.30 ± 0.07    4.4 ± 0.4
            100%                 2.32 ± 0.06    4.7 ± 0.3

Table 3: Variation of the spectral index from the fit under variations of the set of cuts for
G0.9+0.1. The integrated flux above 200 GeV, Φ(>200 GeV), is expressed in units of
10^-12 photons cm^-2 s^-1.

The stability of the analysis in terms of spectral reconstruction has been tested. A variation of
the three sets of cuts has been made around the nominal values. A set of cuts optimized for a signal
over background equivalent to 0.5% and 2% Crab Nebula has been defined for the faint source set of
cuts, as well as sets at 5% and 10%, and 50% and 200%, respectively for the intermediate and strong
source sets of cuts. The stability of the results has been tested under these cut modifications. The
tests have been performed on reference sources representative of faint, intermediate and strong
sources. The spectral results obtained with the modified cuts are in very good agreement with those
obtained with the nominal sets of cuts. Table 3 summarizes the spectral results obtained for the
faint source set of cuts. An additional test has been performed, applying the three sets of cuts
(faint, intermediate and strong source) to these three sources, whatever their strength. The results
of this test are illustrated in table 3. The γ-ray event statistics are indeed modified by the
choice of cuts, but the spectral results remain unchanged. The spectral results obtained with the
BDT appear very robust under cut variations, whatever the set of cuts chosen.

5.4. Morphological analysis

The energy and direction of the selected gamma events 
are the combination of their corresponding estimates from 
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Centaurus A 


Hillas Hard Cuts 


4199 


3869 
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5.0 


0.1 




BDT 


1517 


1146 


371 


10.0 


0.3 


1ES 0347-121 


Hillas Hard Cuts 


1167 


840 


327 


10.1 


0.4 




BDT 


874 


500 


374 


14.3 


0.7 


1ES 1101-232 


Hillas Soft Cuts 


4276 


3623 


649 


10.1 


0.1 




BDT 


1399 


813 


586 


17.9 


0.7 


H2356-309 


model3D 


1706 


1261 


453 


11.6 


0.4 




BDT 


1631 


932 


699 


19.6 


0.8 


Crab Nebula 


Hillas Published 


4759 


483 


4283 


94.2 


8.8 




Model 


10079 


2634 


7293 


99 


2.7 




Model3D 


7460 


1573 


5958 


99 


3.8 




BDT 


6292 


244 


6048 


147.1 


24.8 



Table 1: Results obtained with the BDT analysis for various VHE γ-ray sources compared to the standard Hillas analysis or the published
analysis. A comparison of the BDT analysis with the three reconstruction methods is given for the Crab. Column description: (1) On events;
(2) normalised Off events; (3) γ-ray candidates; (4) excess significance; (5) signal over background ratio.
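
As a quick cross-check of the last column, the signal over background ratio can be recomputed from the tabulated γ-ray candidates and normalised Off events. The sketch below assumes that S/B is simply the ratio of these two columns; the helper name is hypothetical, and small differences from the printed values can arise from rounding of the inputs.

```python
def signal_over_background(excess, normalised_off):
    """Ratio of gamma-ray candidates to normalised Off events (assumed definition)."""
    if normalised_off <= 0:
        raise ValueError("normalised Off count must be positive")
    return excess / normalised_off


# BDT rows of Table 1: (gamma-ray candidates, normalised Off events)
bdt_rows = {
    "Centaurus A": (371, 1146),
    "1ES 0347-121": (374, 500),
    "1ES 1101-232": (586, 813),
    "Crab Nebula": (6048, 244),
}

# Ratios rounded to one decimal, as printed in the table
sb_ratios = {
    name: round(signal_over_background(excess, off), 1)
    for name, (excess, off) in bdt_rows.items()
}
```

For these four rows the recomputed ratios reproduce the tabulated values (0.3, 0.7, 0.7 and 24.8).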



Source        Method         Emin     Emax     Γ            Φ0                     Ecut
G0.9+0.1      Pub - Hillas   200 GeV  9 TeV    2.40 ± 0.11  (5.7 ± 0.7)×10⁻¹² †
              BDT            160 GeV  12 TeV   2.30 ± 0.07  (4.5 ± 0.4)×10⁻¹² †
Centaurus A   Pub - Hillas   250 GeV  6 TeV    2.73 ± 0.45  (2.45 ± 0.52)×10⁻¹³
              BDT            200 GeV  12 TeV   2.71 ± 0.14  (2.32 ± 0.27)×10⁻¹³
1ES 0347-121  Pub - Hillas   250 GeV  3 TeV    3.10 ± 0.23  (4.52 ± 0.85)×10⁻¹³
              BDT            200 GeV  4 TeV    3.27 ± 0.17  (3.69 ± 0.71)×10⁻¹³
1ES 1101-232  Pub - Hillas   200 GeV  4 TeV    2.94 ± 0.20  (5.63 ± 0.89)×10⁻¹³
              BDT            160 GeV  8 TeV    3.05 ± 0.12  (4.65 ± 0.54)×10⁻¹³
H2356-309     Pub - model3D  200 GeV  1.1 TeV  3.09 ± 0.24  (3.00 ± 0.80)×10⁻¹³
              BDT            160 GeV  12 TeV   3.17 ± 0.11  (3.29 ± 0.40)×10⁻¹³
Crab Nebula   Pub - Hillas   450 GeV  65 TeV   2.41 ± 0.04  (38.4 ± 0.9)×10⁻¹²     15.1 ± 2.8
              model          420 GeV  80 TeV   2.41 ± 0.04  (38.2 ± 0.5)×10⁻¹²     10.3 ± 2.2
              model3D        520 GeV  75 TeV   2.35 ± 0.05  (35.2 ± 0.8)×10⁻¹²     12.3 ± 2.3
              BDT            430 GeV  45 TeV   2.48 ± 0.04  (39.0 ± 0.6)×10⁻¹²     13.8 ± 2.8

Table 2: Results of the spectral analysis performed on various VHE γ-ray sources, compared to the published values [lSlllSl, 20, 21, 22]. The
energy range of the fit is indicated in the first and second columns, and the best-fit parameters in the following columns. Φ0
is the differential flux at 1 TeV (in units of photons cm⁻² s⁻¹ TeV⁻¹). † For G0.9+0.1, Φ0 is the integrated flux above 200 GeV (in
units of photons cm⁻² s⁻¹).
Figure 8: On-Off event distribution after selection from PKS 2155-304.
The zenith angle of the shown data set ranges between 10° and
50°. The upper figure is obtained with the standard Hillas analysis.
The lower figure is obtained with the boosted decision tree analysis
described in this work. The arrows on the figures indicate the radius
including 90% and 95% of the γ-ray signal. The dashed lines are the
point spread functions obtained from Monte-Carlo simulations and
corresponding to the analysis.



each of the three standard reconstruction methods. The
approach applied here, which takes into account the covariance
matrices between estimates, is the same as already
applied in Xeff (see [4] and references therein for more details).
It has been shown that this method gives a more accurate
reconstruction and improves the angular resolution
of the H.E.S.S. data analysis. An improved discrimination
also helps to improve the angular resolution. Figure 8 illustrates
the benefits of the improved discrimination and the
combined reconstruction for the On-Off event distributions
of PKS 2155-304. This very bright point-like source is a
good candidate to test the impact of the analysis method
on the angular resolution. The On-Off distributions
are compatible with the point spread function of the
instrument for the respective analysis. This distribution
can be approximated by the sum of two one-dimensional
Gaussian functions. Fitting this sum to the distribution
gives a 68% containment radius of the signal of 0.11°
for the Hillas analysis, reduced to 0.07° with the
BDT method.
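
The containment radius quoted above can be obtained numerically once the two-Gaussian fit is known. The following is a minimal sketch, assuming that the On-Off distribution is expressed as a function of the squared angular distance θ² and that each fitted term has the form A exp(-θ²/(2σ²)); the function names and the parameter values are hypothetical and are not the fit results of this work.

```python
import math


def containment_fraction(r, components):
    """Fraction of the signal within radius r (degrees).

    components: list of (amplitude, sigma) pairs, one per term
    A * exp(-theta^2 / (2 sigma^2)) of the theta^2 distribution (assumed form).
    """
    # Integrating one term over theta^2 from 0 to r^2 gives
    # 2 A sigma^2 (1 - exp(-r^2 / (2 sigma^2))); the factor 2 cancels in the ratio.
    total = sum(a * s * s for a, s in components)
    inside = sum(
        a * s * s * (1.0 - math.exp(-r * r / (2.0 * s * s)))
        for a, s in components
    )
    return inside / total


def containment_radius(components, fraction=0.68, tol=1e-9):
    """Radius enclosing the requested signal fraction, found by bisection."""
    lo, hi = 0.0, 10.0 * max(s for _, s in components)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if containment_fraction(mid, components) < fraction:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical fit parameters (amplitude, sigma in degrees), for illustration only
psf = [(1.0, 0.04), (0.1, 0.10)]
r68 = containment_radius(psf)
```

For a single Gaussian the result reduces to the closed form σ·sqrt(-2 ln 0.32), about 1.51σ, which is a convenient sanity check of the bisection.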

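The covariance-weighted combination of the energy and direction estimates described in this section can be illustrated with the standard minimum-variance linear combination of correlated estimates. This is only a sketch under that assumption, not the actual H.E.S.S. implementation; the function names, the example estimates and the covariance values are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-300:
            raise ValueError("singular covariance matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def combine_estimates(estimates, covariance):
    """Minimum-variance unbiased linear combination of correlated estimates.

    Weights are w = C^-1 1 / (1^T C^-1 1); the combined variance is 1 / (1^T C^-1 1).
    """
    c_inv_ones = solve_linear(covariance, [1.0] * len(estimates))
    norm = sum(c_inv_ones)
    weights = [v / norm for v in c_inv_ones]
    combined = sum(w * e for w, e in zip(weights, estimates))
    return combined, 1.0 / norm, weights


# Hypothetical energy estimates (TeV) from three reconstructions
# and a hypothetical covariance matrix (TeV^2), for illustration only
energies = [1.05, 0.98, 1.10]
cov = [
    [0.040, 0.010, 0.012],
    [0.010, 0.025, 0.008],
    [0.012, 0.008, 0.050],
]
energy, variance, weights = combine_estimates(energies, cov)
```

With this weighting the combined variance is never larger than the smallest individual variance, which is the sense in which a combined reconstruction is expected to be more accurate than any single method.
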
6. Summary 

The discrimination between γ-ray events and hadron-induced
background events is a key issue for ground-based
Cerenkov telescopes such as H.E.S.S. A multivariate
analysis based on boosted decision trees has been studied.
Three analysis methods are currently in use for the
analysis of H.E.S.S. data. The main discriminating variables
from these original methods have been combined.
The discrimination has been further improved by including
the difference between the directions reconstructed by the
various methods. The boosted decision trees have been trained in
several bands in zenith angle and reconstructed energy in
order to improve the rejection over the whole energy range
of the experiment and in all observation conditions.
This leads to a sizable improvement of the sensitivity.
The signal over background ratio is dramatically increased
compared to the original methods. The method has been
tested on several reference sources which represent a wide
range of sources in terms of extension, nature, and observation
conditions. The improvement in terms of signal over
background ratio and significance of the sources is illustrated.
The application of this method also results in a
broader energy range for the spectral fit of faint sources
compared to the previous methods. The robustness of the
analysis in terms of spectral reconstruction has also been
demonstrated. The improved discrimination also brings a
substantial gain in the angular resolution of the analysis.
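
Training in bands of zenith angle and reconstructed energy implies that each event is routed to the classifier trained for its band. The sketch below shows one possible lookup; the band edges and function names are hypothetical placeholders and are not the values used in this work.

```python
from bisect import bisect_right

# Hypothetical band edges; the actual bands are not restated in this summary
ZENITH_EDGES_DEG = [0.0, 20.0, 30.0, 40.0, 50.0, 60.0]
ENERGY_EDGES_TEV = [0.1, 0.3, 1.0, 3.0, 10.0, 100.0]


def band_index(value, edges):
    """Index i of the band [edges[i], edges[i+1]) containing value.

    Values outside the covered range are clamped to the nearest outer band.
    """
    i = bisect_right(edges, value) - 1
    return min(max(i, 0), len(edges) - 2)


def select_band(zenith_deg, energy_tev):
    """Return the (zenith band, energy band) pair used to look up a trained BDT."""
    return (
        band_index(zenith_deg, ZENITH_EDGES_DEG),
        band_index(energy_tev, ENERGY_EDGES_TEV),
    )
```

Clamping out-of-range events to the outermost band is a design choice of this sketch; an analysis could equally reject such events.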